ABSTRACT


This project presents a real-time recognition method that tracks a
person's mood and maps behavioural traits to physiological features.
Emotion recognition is the process of identifying human emotion. People
vary widely in how accurately they recognise the emotions of others, and
the use of technology to assist with emotion recognition is a relatively
nascent research area. In general, the technology performs better when
several modalities are combined. Most work to date has automated the
recognition of facial expressions from video, spoken expressions from
audio, written expressions from text, and physiology as measured by
wearables. In this project, a real-time recognition framework was built
that tracks the mood of an individual using libraries such as Keras,
OpenCV, TensorFlow and SciPy (a Python alternative to MATLAB).
Histograms of optical flow (HOF) and a support vector machine (SVM) were
used to tackle the recognition problem. At each point, optical flow
measures the motion between two frames relative to an observer. Deep
convolutional neural networks were used for image recognition. It was
concluded that the proposed approach was accurate and effective.


KEYWORDS: Emotions, Automatic Recognition, Facial Expression, Keras,
OpenCV, HOF, SVM, Optical Flow, DCNN


INTRODUCTION 

Human facial expressions are commonly divided into seven basic
emotions: sad, happy, surprise, fear, disgust, anger and neutral.
Facial emotions are expressed through the activation of specific
sets of facial muscles. These complex signals often carry an
abundant amount of information about our state of mind. For
example, the host of an event may use such metrics to evaluate
audience interest. Healthcare providers can deliver better service
by using additional information about a patient's emotional state
during treatment. Entertainment producers can monitor audience
engagement at events to consistently create the desired content.
Humans are well trained in understanding the emotions of others; in
fact, at just two years old, babies can already tell the difference
between happy and sad. But can technology do a better job than us
at assessing emotional states? To answer this question, we designed
a deep learning neural network that gives machines the ability to
make inferences about human emotional states. In other words, we
try to give them eyes to see what we can see.


The data set was built by students, each of whom recorded a video
expressing all the emotions with no directions or instructions at
all. Some videos show more of the body than others. In some cases,
the videos have objects in the background or even different
lighting setups. We wanted the data to be as general as possible,
with no restrictions at all, so that it would be a good indicator of
how well the main objective is met. The script detectfaces.py simply
locates the faces in each video, and we saved the resulting video at
a resolution of 240x320. This step produces shaky videos, so we then
stabilized all of them. This can be done in code, and free online
stabilizers are also available. After stabilizing the videos, we ran
them through codeemotionface.py. In this code we developed a way to
extract features based on histograms of dense optical flow (HOF),
and we used a support vector machine (SVM) classifier to tackle the
recognition problem. For each video, we extracted the optical flow
at every frame. Optical flow measures, at each point, the motion
between two frames relative to an observer. Therefore, at each point
in the image there are two values that describe the vector
representing the motion between the two frames: the magnitude and
the angle. Since every video has a resolution of 240x320, each frame
has a feature descriptor of dimensions 240x320x2.
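The magnitude and angle representation described above can be sketched
as follows. This is a minimal illustration, not the project's own code:
it assumes the dense flow field is already available as an array of
per-pixel displacements (dx, dy), for example as returned by a routine
such as OpenCV's cv2.calcOpticalFlowFarneback. The paper does not state
which optical flow routine was used, and the function name
flow_to_polar is hypothetical.

```python
import numpy as np

def flow_to_polar(flow):
    """Convert a dense flow field of shape (H, W, 2) holding (dx, dy)
    into per-pixel magnitude and angle (radians in [0, 2*pi))."""
    dx, dy = flow[..., 0], flow[..., 1]
    magnitude = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx) % (2 * np.pi)
    return magnitude, angle

# Synthetic flow for one pair of 240x320 frames (assumed data):
# every pixel moves 3 pixels along x and 4 pixels along y.
flow = np.zeros((240, 320, 2), dtype=np.float32)
flow[..., 0] = 3.0
flow[..., 1] = 4.0

magnitude, angle = flow_to_polar(flow)
# Stacking the two maps gives the 240x320x2 per-frame descriptor.
descriptor = np.stack([magnitude, angle], axis=-1)
```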


So, the full video descriptor has dimensions
#frames x 240 x 320 x 2. To make a video comparable to other inputs
(inputs of different lengths are not directly comparable with each
other), we need a way to summarize it into a single descriptor. We
summarize the video by calculating a histogram of the optical flows.
That is, we separate the extracted flows into categories and count
the number of flows in each category. To retain spatial detail, we
split the scene into a grid of s by s cells (s = 10 in this case) to
record the location of each feature, and then classify the direction
of each flow as one of the 8 motion directions considered in this
problem.
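A per-frame histogram of this kind might look like the sketch below. It
assumes the flow has already been converted to magnitude and angle
maps, with angles in radians in [0, 2*pi). The function name frame_hof
and the min_mag threshold for ignoring motionless pixels are
illustrative assumptions, not details taken from the project.

```python
import numpy as np

def frame_hof(magnitude, angle, s=10, n_dirs=8, min_mag=0.0):
    """Histogram of optical flow for one frame.

    Splits the frame into an s-by-s grid and, in each cell, counts how
    many flow vectors fall into each of n_dirs direction bins.
    Returns an integer array of shape (s, s, n_dirs).
    """
    h, w = magnitude.shape
    # Grid cell index of every pixel row and column.
    rows = np.arange(h) * s // h
    cols = np.arange(w) * s // w
    cell_r, cell_c = np.meshgrid(rows, cols, indexing="ij")
    # Direction bin of every pixel.
    dir_bin = np.floor(angle / (2 * np.pi) * n_dirs).astype(int) % n_dirs
    hist = np.zeros((s, s, n_dirs), dtype=np.int64)
    moving = magnitude > min_mag  # skip pixels with no motion
    np.add.at(hist, (cell_r[moving], cell_c[moving], dir_bin[moving]), 1)
    return hist

# Synthetic frame: the left half moves at angle 0, the right half at pi.
magnitude = np.ones((240, 320))
angle = np.zeros((240, 320))
angle[:, 160:] = np.pi
hist = frame_hof(magnitude, angle)
```

With s = 10 and a 240x320 frame, each grid cell covers 24x32 pixels, so
the descriptor shrinks from 240x320x2 values to 10x10x8 counts.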


After this, we count the number of flows that fall into each
direction bin. We end up with an s by s by 8 descriptor for each
frame. The summarizing step for each video can then be either the
average of the histograms in each grid cell over all the frames of
the video (average pooling) or the maximum value of the histograms in
each grid cell over all the frames (max pooling). For the
classification step, we use a support vector machine (SVM) with a
non-linear kernel, as discussed in class, to recognize new facial
expressions. We also considered a Naive Bayes classifier, but SVMs
are widely reported to outperform it on computer vision tasks. A
confusion matrix can be produced to present the results more clearly.
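The two pooling options can be sketched as below. The example assumes
the per-frame histograms are stacked into an array of shape
(#frames, s, s, 8). The function name pool_video and the random input
data are illustrative only.

```python
import numpy as np

def pool_video(frame_hists, method="average"):
    """Collapse per-frame histograms of shape (n_frames, s, s, 8)
    into one fixed-length descriptor of length s * s * 8."""
    frame_hists = np.asarray(frame_hists, dtype=np.float64)
    if method == "average":
        pooled = frame_hists.mean(axis=0)
    elif method == "max":
        pooled = frame_hists.max(axis=0)
    else:
        raise ValueError("method must be 'average' or 'max'")
    return pooled.ravel()

# Two synthetic videos of different lengths (assumed data).
rng = np.random.default_rng(0)
short_video = rng.integers(0, 50, size=(30, 10, 10, 8))
long_video = rng.integers(0, 50, size=(90, 10, 10, 8))

avg_short = pool_video(short_video, "average")
max_long = pool_video(long_video, "max")
```

Both videos end up as vectors of the same length (800 for s = 10),
which is what allows them to be passed to a classifier. One possible
choice, not confirmed by the source, would be scikit-learn's
sklearn.svm.SVC with an RBF kernel.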


The main purpose of this project is to develop an Automatic Facial
Expression Recognition System that takes human facial images
containing some expression as input and classifies them into five
expression classes: I. Neutral II. Angry III. Happy IV. Sad
V. Surprise.


Problem Statement 

Human expressions can readily be classified as joy, sadness,
surprise, neutral, fear, anger and disgust. Our facial emotions are
conveyed by triggering specific sets of facial muscles. A single
expression can tell us far more than words. An emotion detector
helps us recognize emotion, which in turn helps us measure how
viewers respond to content and services. For example, merchants can
use these metrics to assess consumer interest. Healthcare providers
may deliver improved support by using additional knowledge about the
emotional state of patients during therapy. Entertainment producers
can track audience engagement at events in order to reliably
generate the desired content.


But can machines do a better job than us at assessing emotional
states? To address this question, we built a deep learning neural
network that gives computers the ability to infer our emotional
states. The method developed to recognize facial expressions
consists of the following steps:

1. Locate the face in the picture (this step is also known as
   face detection).

2. Extract features from the detected face region (e.g. detect
   the outlines of the facial components; this stage is referred
   to as facial feature extraction).

3. Analyse the motion of the facial features or the changes in
   their appearance, and group this information into categories
   such as facial muscle movements (e.g. frowning), emotions
   (e.g. anger, pleasure, rage) and attitudes (e.g. (dis)liking
   or ambivalence).

Experiments were conducted with the aim of creating an automatic
recognition system and increasing its efficiency for emotion
detection.

Figure 1: Flowchart of the problem formulation of our project


Image-Based Face Recognition Algorithms

Numerous methodologies have been established for the identification
of facial expressions. The primary task is feature extraction from
the given images. The approaches considered for feature extraction
are reviewed below.


A. Principal Component Analysis:

Principal Component Analysis (PCA) is a classic tool used in the
appearance-based approach to dimensionality reduction and feature
extraction in most pattern recognition systems. The basic PCA
technique used in face recognition is referred to as the eigenface
approach. It transforms each face into a small set of essential
features, the eigenfaces, which are the principal components of the
training image set. Recognition is performed by projecting a new
image into the eigenface subspace.
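The PCA projection described above (commonly known as the eigenface
method) can be sketched with NumPy alone. This is a minimal
illustration on random data standing in for flattened 48x48 face
images. The function names fit_eigenfaces and project are hypothetical.

```python
import numpy as np

def fit_eigenfaces(images, n_components):
    """images: array of shape (n_images, n_pixels), one flattened face
    per row. Returns the mean face and the top n_components
    principal directions of the training set."""
    mean_face = images.mean(axis=0)
    centered = images - mean_face
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:n_components]

def project(face, mean_face, eigenfaces):
    """Coordinates of a flattened face in the PCA subspace."""
    return eigenfaces @ (face - mean_face)

rng = np.random.default_rng(0)
train = rng.random((20, 48 * 48))   # 20 synthetic faces (assumed data)
mean_face, eigenfaces = fit_eigenfaces(train, n_components=5)
weights = project(train[0], mean_face, eigenfaces)
```

A new face would then be recognized by comparing its weights vector
with the weight vectors of the training faces, for example by nearest
neighbour distance.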


B. Independent Component Analysis:
Independent Component Analysis (ICA) is an extension of principal
component analysis. This feature extraction approach is reported to
offer better performance than principal component analysis.





@IJTSRD | Unique Paper ID-ITJTSRD38245 | Volume - 5 | Issue - 1 | November-December 2020 Page 1546
International Journal of Trend in Scientific Research and Development (IJTSRD) @ www.ijtsrd.com eISSN: 2456-6470


C. Linear Discriminant Analysis Based Feature Extraction for Face
Recognition:

Linear Discriminant Analysis (LDA) is a common technique for
reducing dimensionality. It transforms the data from a high
dimension to a low dimension by removing the unwanted features. It
is widely used in machine learning and deep learning systems.


D. Feature Extraction Using Support Vector Machine:
The Support Vector Machine (SVM) approach is used to extract
discriminative information from the training data. It is a form of
binary classification. The support vector machine is reported to
offer better performance than feature extraction based on principal
component analysis. Feature extraction for the support vector
machine can be carried out using a wavelet transform; the Haar
transform may be the most suitable choice for this method. The Haar
transform is applied to the positive and negative training samples.


E. Successive Mean Quantization Transform (SMQT):

Classification and detection can also be carried out using the
Successive Mean Quantization Transform (SMQT) together with a Sparse
Network of Winnows (SNoW). Here the extracted features are passed to
the SNoW classifier.


F. Feature Extraction Based on the Convolutional Neural Network
with Gabor Filter:

A convolutional neural network is an integrated classification tool.
It uses local receptive fields over the image, shared weights and
sub-sampling to extract and then combine features in a
distortion-invariant way. The feature extractor is produced by the
learning process.


CONVOLUTIONAL NEURAL NETWORK (CNN)

Deep learning and artificial intelligence techniques are advancing
the field of facial recognition research and its related
applications. These approaches stack multiple processing layers to
learn patterns in data through several stages of feature extraction.
The Convolutional Neural Network has become a dominant deep learning
approach for face recognition.


Several convolutional network architectures, such as AlexNet,
VGGNet, GoogLeNet and ResNet, are available to maximise system
performance. A convolutional neural network is a statistical model,
fitted by solving an optimization problem, that is used to classify
and distinguish images. It is designed to recognise patterns.
Pattern recognition requires numerical data, into which images,
text, voice signals, etc. are translated.


A CNN is analogous to an ordinary neural network. It is made up of
neurons that have weights and biases and is used for image
recognition. In a convolutional neural network these neurons are
called convolutional neurons. The general layout of a convolutional
network consists of convolution layers, pooling layers, a flattening
step and a fully connected output layer. Each convolutional neuron
takes the dot product of its weights with a patch of the input
image.


The network as a whole expresses a single differentiable score
function. The convolution operation is responsible for searching for
the patterns present in images.
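To make the convolution operation concrete, the sketch below slides a
small filter over an image and takes a dot product at each position.
It is an illustration in plain NumPy with a hand-picked edge filter and
a toy image, both assumptions; in a trained network the filter weights
are learned. As in most deep learning libraries, the filter is not
flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide kernel over image (no padding, stride 1) and take the dot
    product of the kernel with each image patch."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 image: dark on the left, bright on the right.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
# 3x3 filter that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)
feature_map = conv2d_valid(image, kernel)
```

The resulting 4x4 feature map is non-zero only in the columns where
the filter straddles the edge, which is the sense in which a filter
"searches" for a pattern.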


The filter size is the main parameter of a CNN. The pooling layer,
which reduces computation by decreasing the spatial size of the
image, is another important component of the convolutional network.
The activation function is the most important factor in converting
input into output through a neural network. The sigmoid function is
one commonly used activation function.
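The pooling layer and the sigmoid activation mentioned above can be
sketched as follows. This is a minimal NumPy illustration with
made-up input values. The 2x2 window with non-overlapping steps is an
assumption that matches the most common configuration.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window,
    halving the height and width of the feature map."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]  # drop an odd last row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def sigmoid(x):
    """Squash any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

feature_map = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool_2x2(feature_map)
activated = sigmoid(np.array([-2.0, 0.0, 2.0]))
```

Pooling a 4x4 map gives a 2x2 map, so every later layer has a quarter
as many values to process.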
ALGORITHM

Phase one: Collection of the image data set. (We use the FER2013
archive of 35,887 pre-cropped 48x48 grayscale photographs, each
labelled with one of seven emotion types: anger, disgust, fear,
happiness, sadness, surprise and neutral.)

Phase two: Pre-processing of the image.

Phase three: Detection of the faces in the image.

Phase four: Conversion of the detected faces into grayscale
pictures.

Phase five: This phase ensures that the image can be fed into the
input layer as a (1, 48, 48) numpy array.

Phase six: Passing the numpy array into the Convolution2D layer.

Phase seven: Feature maps are created by convolution.

Phase eight: The MaxPooling2D pooling layer reduces the size of the
feature maps by sliding a two-by-two window over them and keeping
only the maximum value in each window.

Phase nine: Forward and backward propagation of the neural network
were carried out on the pixel values during training.

Phase ten: The Softmax function outputs a probability for each
emotion type.

This model shows us the detailed probability of each emotion in a
human facial expression.
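Phases five and ten can be illustrated with the short sketch below,
which shapes a face crop into a (1, 48, 48) array and converts raw
class scores into probabilities. The placeholder face, the scaling of
pixel values to [0, 1] and the example scores are all assumptions for
illustration; in the real system the scores would come from the
trained network.

```python
import numpy as np

# Label order follows the 0-6 coding used by the dataset.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]

def to_input_array(gray_face):
    """Scale a 48x48 grayscale face to [0, 1] and add a leading axis."""
    face = np.asarray(gray_face, dtype=np.float32) / 255.0
    return face.reshape(1, 48, 48)

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

face = np.full((48, 48), 128, dtype=np.uint8)   # placeholder face crop
x = to_input_array(face)

scores = np.array([0.2, -1.0, 0.1, 3.0, 0.5, 1.2, 0.4])  # made-up outputs
probs = softmax(scores)
predicted = EMOTIONS[int(np.argmax(probs))]
```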


Database:

The database used for this model was that of the Kaggle Facial
Expression Recognition Competition. 48x48 pixel grayscale images
were used as the input source. The faces have been automatically
registered so that each face is roughly centred and occupies about
the same amount of space in each image. The task is to classify each
face, based on the emotion shown in the facial expression, into one
of seven categories (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad,
5=Surprise, 6=Neutral).


The training set consists of 28,709 examples. The public test set
used for the leaderboard consists of 3,589 examples; a further set
of examples formed the final test set used to determine the winner
of the competition.
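The size of the final test set is not stated directly, but it follows
from the figures quoted in this paper. The sketch below derives it by
subtraction and reports the share of each split. The variable names
are illustrative.

```python
TOTAL_IMAGES = 35887        # size of the FER2013 archive quoted earlier
TRAIN_IMAGES = 28709        # training set
PUBLIC_TEST_IMAGES = 3589   # public leaderboard test set

# Whatever is left over must be the final test set.
final_test_images = TOTAL_IMAGES - TRAIN_IMAGES - PUBLIC_TEST_IMAGES

splits = {
    "train": TRAIN_IMAGES,
    "public_test": PUBLIC_TEST_IMAGES,
    "final_test": final_test_images,
}
# Fraction of the archive in each split, rounded to two decimals.
fractions = {name: round(count / TOTAL_IMAGES, 2) for name, count in splits.items()}
```

Under these figures the archive is divided roughly 80/10/10 between
training, public test and final test data.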


The emotion labels used were:

> 0: 4,953 images labelled with the expression of anger

> 1: 547 images labelled with the expression of disgust

> 2: 5,121 images labelled with the expression of fear

> 3: 8,989 images labelled with the expression of happiness

> 4: 6,077 images labelled with the expression of sadness

> 5: 4,002 images labelled with the expression of surprise

> 6: 6,198 images labelled as neutral


Results
The emotion detector built in software produced the following
results for the inputs applied.

Figure 3: Output of the happy emotion detected.

Figure 4: Output of the neutral emotion detected.

Figure 7: Output of the surprise emotion detected.


Conclusion 

In this research study, the behavioural characteristics of people
were mapped to the physiological biometric features involved in
facial expressions. A base matching protocol for this system was
developed in correlation with the features of the human face, using
geometric structures for expressions such as joy, sorrow, anxiety,
rage, surprise and neutral. The behavioural component of this
system, as a property base, relates to the mindset behind the
various expressions. The property base was built up and hidden
features were exposed in terms of algorithmic genes. In the field of
biometric security, the gene training set tests the expressive
individuality of human faces and gives us an efficient design. The
invention of a modern biometric-based asymmetric cryptosystem with
features such as hierarchical group authentication removes the need
for passwords and smart cards, in contrast to earlier cryptosystems.


Like all other biometric systems, it needs special hardware support.
This study gives an insight into the attractive new field of
asymmetric biometric cryptosystems, which aim to overcome the
problems related to passwords and smart cards. Analysis of the data
from the research shows that physiological characteristics are
effectively identified by hierarchical security structures in
geometric shape identification.


Future Work:- 

A neural network is a complex structure, and there is no simple
recipe for how it operates. Each problem requires its own network
architecture and a good deal of trial and error to reach the desired
accuracy. This is why neural networks are often described as
black-box algorithms.


The project achieved an accuracy of almost 70 percent, and the model
was concluded to be better than the earlier architecture designs.
However, there are specific areas that still need to be tuned, such
as:

> Number and configuration of convolutional layers

> Number and configuration of dense layers

> Dropout percentage in dense layers

We could not explore deeper neural networks because we lacked a
sufficiently powerful system, and such networks are slower to train.
We intend to pursue this in the future.


To make the model more accurate, we attempted to train it on a
larger amount of data, but this was not successful because of
resource constraints. Future work should therefore aim to reduce the
errors and increase the efficiency of the model.


As new methods are explored, it is becoming easier to detect the
variations in facial expressions, which will allow facial patterns
to be analysed in depth and classified. The optimal fusion of colour
information remains to be explored in more detail in the near
future. Further analysis can be carried out in the direction of the
gene alleles corresponding to the geometric variables of facial
expressions. To meet the needs of different security applications,
such as crime detection records and breaches of confidential
government security, the evolution of facial pattern structure
should be studied in relation to genetic characteristics.